Abstract

Background: The bacterial genus Salmonella contains thousands of serotypes that infect humans or other hosts, causing
mild gastroenteritis to potentially fatal systemic infections in humans. Pathogenically distinct Salmonella serotypes have 
been classified as individual species or as serological variants of merely one or two species, causing considerable confusion 
in both research and clinical settings. This situation reflects a long unanswered question regarding whether the Salmonella 
serotypes exist as discrete genetic clusters (natural species) of organisms or as phenotypic (e.g. pathogenic) variants of a 
single (or two) natural species with a continuous spectrum of genetic divergence among them. Our recent work, based on 
genomic sequence divergence analysis, has demonstrated that genetic boundaries exist among Salmonella serotypes, 
circumscribing them into clear-cut genetic clusters of bacteria. 

Methodologies/Principal Findings: To further test the genetic boundary concept for delineating Salmonella into clearly 
defined natural lineages (e.g., species), we sampled a small subset of conserved genomic DNA sequences, i.e., the 
endonuclease cleavage sites that contain the highly conserved CTAG sequence such as TCTAGA for XbaI. We found that the
CTAG-containing cleavage sequence profiles could be used to resolve the genetic boundaries as reliably and efficiently as 
whole genome sequence comparisons but with enormously reduced requirements for time and resources. 

Conclusions: Profiling of CTAG sequence subsets reflects genetic boundaries among Salmonella lineages and can delineate 
these bacteria into discrete natural clusters. 
Introduction

Since the first isolation of a Salmonella pathogen from a typhoid
patient in 1881, more than 2500 different Salmonella types have
been documented [1,2]. Based on their differences in the somatic
(O) and flagellar (H) antigens, the Salmonella bacteria are
classified into serotypes by the Kauffmann-White scheme [3].
Initially, the Salmonella serotypes were treated as individual 
species each having a Latinized scientific name such as Salmonella 
typhi and Salmonella typhimurium, but in the 1980s all Salmonella
serotypes were combined into one species (Salmonella enterica [4])
or two species (Salmonella enterica and Salmonella bongori [5]) as
serological variants (serovars [6]) due largely to the extraordinarily
high genetic similarity among them, which has caused confusion in
research and clinical settings. Indeed, all Salmonella serotypes
have very similar genetic backgrounds as revealed by DNA-DNA 
re-association [7], comparison of genome structures [8,9] and 
genomic sequencing [10-12], but on the other hand they may 
differ radically in pathogenic properties. For example, whereas 
many Salmonella serotypes may cause self-limited gastroenteritis 
(such as S. typhimurium, S. enteritidis, etc.) or may be virtually 
non-pathogenic to humans, a few may elicit potentially fatal 
systemic infections, such as S. typhi that causes typhoid [13]. The 
dynamic and confusing Salmonella taxonomy reflects long lasting 
uncertainties about the phylogenetic status of Salmonella: do they
dwell in nature as discrete genetic clusters of organisms or as 
phenotypic variants of a single (or two) natural species with a 
continuous spectrum of genetic divergence among them? 

Figure 1. Diversity of cleavage patterns with AvrII among S.
gallinarum wild type strains. Lanes: 1, molecular size marker (λ DNA
concatemer); 2, RKS5078; 3, SGSC2423; 4, SGSC2292; 5, SGSC2293; 6,
R1481; 7, R1482; 8, R1483; 9, SARB21; 10, 287/91. S. pullorum RKS5078
(Lane 2) is included here for a comparison with the S. gallinarum strains.
doi:10.1371/journal.pone.0103388.g001 

To examine this issue, we have tested two hypotheses: first, that 
all Salmonella serotypes form a common gene pool in which DNA 
exchange occurs readily so that each member has an equal chance 
to become a different pathogen (e.g., infecting a different host 
species or causing a different disease) by acquiring appropriate 
genetic material and incorporating it into the genome; and second, 
that each Salmonella type (e.g. a serological or pathogenic type) is 
already an established biological unit, members of which have a
common and highly stable genome structure as a result of natural
selection over long evolutionary time. 

If the first hypothesis is correct, all Salmonella serotypes should 
be combined into just one species. If the second hypothesis is 
correct, each Salmonella type is a genetically well-defined natural 
species. The first hypothesis would be supported by demonstration 
of a continuous spectrum of genetic divergence among different 
Salmonella types and, conversely, the second hypothesis would be 
validated by demonstration of clear-cut genetic boundaries among 
different Salmonella types as a result of genetic isolation and 
independent accumulation of mutations over long evolutionary 
time. Findings that support either hypothesis will lead to novel
insights into the population structure of Salmonella and the 
mechanisms of divergence that have occurred during their 
adaptation to different environments (e.g., a particular host) 
during their evolution. A key step towards an answer is to elucidate 
whether the individual Salmonella types can be grouped into 
discrete, well separated genetic clusters. The classical method for 
Salmonella differentiation is serological typing, but a serotype may 
be polyphyletic. For example, the antigenic formula of 6,7:c:1,5 is
common to multiple distinct pathogens, e.g., S. paratyphi C, S. 
choleraesuis and S. typhisuis, which infect different hosts or cause 
different diseases. Furthermore, based on serotyping only, one 
cannot judge whether the Salmonella serotypes are genetically well 
isolated from one another or whether some might be genetic 
"intermediates" between other serotypes. 

Recently, we provided evidence showing that Salmonella exist 
in discrete genetic clusters isolated by clear-cut genetic boundaries 
[14]. However, that work was based on whole genome analysis. To 
further test the robustness of the genetic boundary concept in 
delineating Salmonella into clearly defined natural lineages (e.g., 
species), we sampled a small subset of conserved genomic DNA 
sequences, i.e., the endonuclease cleavage sites that contain the 
CTAG sequence such as TCTAGA for XbaI. As enteric bacteria
tend to eliminate the short sequence CTAG by the Very Short 
Patch (VSP) repair mechanism [15], endonuclease cleavage sites 
containing CTAG are scarce and highly conserved in Salmonella. 
We found that profiling of the CTAG-containing cleavage 
sequences could resolve the genetic boundaries as reliably and 
efficiently as whole genome analyses but with enormously reduced
requirements for time and resources. 
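
To make the idea of a CTAG-containing cleavage-site profile concrete, the following is a minimal in silico sketch, not the procedure used in this study (which relied on PFGE and published genome sequences). The recognition sequence for XbaI is taken from the text; those for AvrII (CCTAGG) and SpeI (ACTAGT) are the standard recognition sequences and are added here as an assumption. The toy sequence and the function name find_sites are hypothetical.

```python
# Recognition sequences of three endonucleases whose sites contain CTAG.
# TCTAGA (XbaI) is given in the text; the other two are assumed standard values.
SITES = {"XbaI": "TCTAGA", "AvrII": "CCTAGG", "SpeI": "ACTAGT"}


def find_sites(genome: str, site: str) -> list[int]:
    """Return the 0-based start position of every occurrence of `site`."""
    genome = genome.upper()
    positions = []
    start = genome.find(site)
    while start != -1:
        positions.append(start)
        start = genome.find(site, start + 1)
    return positions


# Hypothetical toy sequence, not real Salmonella DNA.
toy = "GGTCTAGACCATGCCTAGGTTACTAGTAATCTAGA"

# The "profile" is simply the set of site positions per enzyme.
profile = {name: find_sites(toy, seq) for name, seq in SITES.items()}
print(profile)
```

Because all three recognition sequences are palindromic, scanning one strand is sufficient in this sketch; a circular genome would additionally need sites spanning the origin to be checked.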

Results 

Monophyletic Salmonella serotypes have highly 
conserved cleavage patterns by CTAG-containing 
endonucleases 

It has been well documented that wild type strains of a
monophyletic Salmonella serotype, such as S. typhimurium [16],
S. typhi [17] or S. paratyphi A [18], exhibit highly similar
endonuclease cleavage patterns for XbaI and BlnI/AvrII on PFGE, in
comparison with the diverse cleavage patterns seen in polyphyletic
serotypes such as S. paratyphi B [19]. However, when we looked at
S. gallinarum, known as a monophyletic Salmonella serotype, we
saw considerable diversity of cleavage patterns among wild type
strains for the endonucleases that have CTAG in the cleavage
sites, as illustrated by AvrII cleavage in Figure 1. To determine
whether the diversity of cleavage patterns was created by
nucleotide base changes (leading to addition or deletion of
cleavage sites) or by genomic rearrangements (changing the
lengths of DNA fragments between the cleavage sites), we
compared the genome structures of these strains. Analysis of
incomplete I-CeuI cleavage products of the bacterial genomes
showed that these strains had their genomes rearranged in several

Figure 2. Genomic rearrangements of S. gallinarum strains. (A) PFGE patterns of genomic DNA incompletely cleaved with I-CeuI. Lanes:
same as in Figure 1. (B) Genome maps based on the I-CeuI data in (A). As seen here, wild type strains have the seven I-CeuI fragments organized
differently, with six genome types being resolved among the 8 strains of S. gallinarum. The map of S. pullorum RKS5078 is presented here for a
comparison with the S. gallinarum strains.
doi:10.1371/journal.pone.0103388.g002



ways by recombination between rrn operons (Figure 2; for details
about I-CeuI and rrn-mediated genomic rearrangements, see
[8,20]), suggesting that at least part of the diversity in cleavage patterns
has resulted from genomic rearrangements.

Next, we needed to determine whether the genomic rearrangements
had merely altered the lengths between pairs of AvrII sites or
might have disrupted any of the AvrII cleavage sites (it is highly
unlikely that genomic rearrangements would create new
CTAG-containing cleavage sites). For this, we compared two
representative S. gallinarum strains, SARB21 and 287/91 (Figure 3),
which were previously mapped [21] or sequenced [22], respectively.
We analyzed the genome maps of the two strains by
matching the homologous cleavage sites between them for XbaI
and AvrII, in addition to I-CeuI. We found that, as expected, most
of the cleavage pattern differences between S. gallinarum
SARB21 and 287/91 could be accounted for by two inversions
(one between rrnH and rrnG and one between rrnD and rrnC)
and one translocation (I-CeuI Fragment D), all of which massively
altered the lengths of homologous genomic DNA segments flanked
by the CTAG-containing endonuclease cleavage sites (Figure 4).
The rrnH-rrnG inversion caused XbaI Fragments C and I to join,
forming Fragments C'+I' and 'C+'I (XbaI C391 and I248 are missing
and XbaI C'+I' 614 and 'C+'I 25 appear in strain SARB21
relative to 287/91), along with corresponding changes in AvrII
cleavage (see Figures 3 and 4). The I-CeuI Fragment D
translocation and rrnD-rrnC inversion resulted in XbaI Fragment
B533 splitting into B' and 'B, with B' joining H' to become
B'+H'488, and a truncated 'B160 fusing with 'H+F to create a 483 kb
segment, along with corresponding changes in AvrII cleavage (see
Figures 3 and 4). The only unique AvrII cleavage site is present in
strain 287/91 at about 3250 kb from gene thrL (Figure 4,
indicated by the open arrowhead), probably as a result of a
mutation in the corresponding AvrII cleavage site in strain
SARB21 rather than creation of an AvrII cleavage site in strain
287/91.
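
A quick arithmetic check illustrates why an inversion changes band sizes without changing cleavage sites: the two breakpoints redistribute DNA between the affected fragments, so the summed length must be conserved. The sketch below uses only the XbaI fragment sizes (in kb) quoted above for the rrnH-rrnG inversion; the variable and function names are hypothetical.

```python
# XbaI fragment sizes in kb, as quoted in the text for the rrnH-rrnG inversion.
fragments_287_91 = {"C": 391, "I": 248}
fragments_SARB21 = {"C'+I'": 614, "'C+'I": 25}


def total_kb(fragments: dict[str, int]) -> int:
    """Summed length of a set of restriction fragments."""
    return sum(fragments.values())


def is_length_conserving(before: dict[str, int], after: dict[str, int]) -> bool:
    """True if two fragment sets cover the same total amount of DNA."""
    return total_kb(before) == total_kb(after)


print(total_kb(fragments_287_91), total_kb(fragments_SARB21))
print(is_length_conserving(fragments_287_91, fragments_SARB21))
```

Both sets sum to 639 kb, which is consistent with the interpretation that the rearrangement moved the breakpoints within Fragments C and I rather than adding or removing an XbaI site.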

Conservation of CTAG-containing endonuclease cleavage 
sites within other representative Salmonella serotypes 

To assess the extent of conservation of the CTAG-containing 
endonuclease cleavage sites, we conducted systematic comparisons 
of the cleavage locations on the genome for XbaI among strains of
representative Salmonella serotypes, numbering the cleavage sites 
sequentially according to their locations on the genome of S. 
typhimurium LT2. Cleavage sites present in any strains but not in 

Figure 3. XbaI and AvrII cleavage patterns of S. gallinarum strains 287/91 and SARB21 after PFGE separation. (A) XbaI cleavage. Lanes:
1, SARB21; 2, 287/91; 3, λ DNA as molecular size marker. (B) AvrII cleavage. Lanes: 1, λ DNA as molecular size marker; 2, SARB21; 3, 287/91. Letter
designations are for strain 287/91; the same letters are used for homologous fragments in strain SARB21. In the designation of fragments in SARB21,
C' means a fragment homologous to C in 287/91 but truncated on the right-hand part by genomic rearrangement, and 'C means truncation on the
left-hand part of the fragment.
doi:10.1371/journal.pone.0103388.g003



LT2 were not numbered. As exemplified by the six S. typhimurium
strains, the XbaI cleavage sites were highly conserved within a
Salmonella lineage, consistent with the findings by the PFGE
techniques. LT2 has 27 XbaI cleavage sites, numbered XbaI 1-27
(Table 1), most of which were conserved among all six compared
S. typhimurium strains. Of particular significance, more than
one third of the 27 XbaI cleavage sites fell in intergenic sequences,
strongly suggesting the potential importance of these sequences.
Among the six S. typhimurium strains, we found two kinds of
differences in XbaI sites: presence/absence and presence/degeneracy.
The non-conserved XbaI cleavage sites have largely
resulted from recent insertions such as prophages or phage
remnants (Supplementary Table S1). The sequence degeneracy of
the XbaI cleavage sites can be illustrated by XbaI 9, which was
present in LT2 but not in any of the other five S. typhimurium
strains due to a nucleotide substitution, changing the XbaI cleavage
site TCTAGA to TCCAGA and leading to the replacement of
leucine in LT2 by proline in the other five S. typhimurium strains.
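
The leucine-to-proline replacement implies a particular reading frame through the XbaI 9 site. As a minimal sketch, assuming the gene is read on the strand shown (an assumption; the text does not state the strand), the code below translates the complete codons of TCTAGA and TCCAGA in each of the three frames. The codon table is deliberately restricted to the codons that occur in this example, and the function names are hypothetical.

```python
# Minimal codon table: only the codons that occur in this example.
CODONS = {"TCT": "Ser", "TCC": "Ser", "AGA": "Arg",
          "CTA": "Leu", "CCA": "Pro", "TAG": "Stop", "CAG": "Gln"}


def codons_in_frame(site: str, offset: int) -> list[str]:
    """Complete codons of a site when reading starts at `offset` (0, 1 or 2)."""
    return [site[i:i + 3] for i in range(offset, len(site) - 2, 3)]


def frame_effects(wild: str, mutant: str) -> dict[int, tuple[list[str], list[str]]]:
    """For each frame, the residues encoded before and after the substitution."""
    effects = {}
    for offset in range(3):
        before = [CODONS[c] for c in codons_in_frame(wild, offset)]
        after = [CODONS[c] for c in codons_in_frame(mutant, offset)]
        effects[offset] = (before, after)
    return effects


effects = frame_effects("TCTAGA", "TCCAGA")
for offset, (before, after) in effects.items():
    print(offset, before, "->", after)
```

Under this assumption only the frame starting at the second base (CTA to CCA) gives Leu to Pro; the first frame is synonymous (TCT to TCC, both Ser), and the third would place a stop codon (TAG) in the wild type sequence.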

Within each of the other Salmonella serotypes analyzed, the
CTAG-containing cleavage sites were also highly conserved, with
the main differences among the wild type strains being additional
cleavage sites in prophages or genomic islands (Supplementary
Table S1). For example, S. heidelberg SL476 had three large
genomic islands, 58, 30 and 42 kb in size, respectively, all
containing multiple XbaI cleavage sites; the 42 kb island, present
in S. heidelberg SL476 but not in S. heidelberg B182, contained as
many as seven additional XbaI cleavage sites within a 20 kb region
(Supplementary Table S1). Other endonucleases (e.g., SpeI)

Figure 4. Physical map comparison between S. gallinarum strains 287/91 and SARB21. The map of SARB21 was reported previously [21];
here letter designations for the cleavage fragments of SARB21 have been changed according to the homologues in strain 287/91 for the convenience
of comparison. Note that all XbaI, I-CeuI and AvrII (maps from top to bottom) cleavage sites are conserved in the two strains except the AvrII site
between fragments F and J in 287/91 (open arrow), which is missing from SARB21. Lines with solid arrowheads at both ends indicate the ranges of
genomic inversions via rrn-mediated recombination between the two strains and filled arrows indicate recombination sites that have resulted in the
translocation of I-CeuI fragment D.
doi:10.1371/journal.pone.0103388.g004



having CTAG in the cleavage sites showed a similar situation to XbaI
(data not shown). The overall conservation of the CTAG- 
containing endonuclease cleavage sequences in the Salmonella 
genomes makes it possible to use these endonucleases for the 
identification of Salmonella isolates. For this, the distinctness of 
cleavage patterns of endonucleases with CTAG in the cleavage 
sequences across different Salmonella serotypes (or lineages; a 
monophyletic Salmonella serotype is equivalent to "a Salmonella 
lineage" but a polyphyletic Salmonella serotype contains two or 
more Salmonella lineages) would have to be documented. 

CTAG endonuclease cleavage patterns are distinct across 
Salmonella lineages 

Across the 13 Salmonella serotypes analyzed, cleavage patterns
for the endonucleases that contain CTAG in the cleavage sites
were drastically different, and the sites at different genomic
locations also had different levels of conservation; here we take
XbaI cleavage as an example to illustrate the levels of conservation
of the CTAG-containing sequences at different genomic locations.
First, the XbaI cleavage sites within the tRNA-encoding
sequences had the highest level of conservation among the 13
Salmonella serotypes and even E. coli strain K12, as illustrated
previously (Fig. 4 in [23]). Of great interest, XbaI 3 within an
intergenic sequence (between STM1377-STM1378) is also
conserved among the 13 Salmonella serotypes and E. coli strain
K12; the potential biological function encoded by this genomic
region is now under scrutiny. XbaI 4, 16 and 17, located in
intergenic sequences between STM1622-STM1623, STM3405-STM3406
and STM3443-STM3444, respectively, are conserved
in all analyzed Salmonella strains; characterization of these
intergenic sequences for their potential roles in bacterial biology
might provide novel insights into the evolution of bacteria. XbaI
26 in STM4362 (hflX) is conserved in all analyzed Salmonella
strains, and XbaI 7, located in an intergenic sequence between
STM2394-STM2394, is conserved in all Salmonella subgroup I
strains analyzed here. Most other XbaI cleavage sites are specific
either to one or to a subset of Salmonella lineages (Supplementary
Table S1). SpeI and other endonucleases having CTAG in the
cleavage sites showed similar general patterns to XbaI (data not
shown). The distinct profiles of the CTAG-containing endonuclease
cleavage sequences among the Salmonella serotypes make it
possible to use these enzymes for delineating Salmonella into
genetically well defined natural clusters, which would have to be
further validated by comparisons between CTAG-containing
cleavage site profiling and genome sequence information.

Distinct CTAG-containing cleavage profiles to delineate 
Salmonella into natural lineages: correlation with core 
genome-based phylogenetics 

The high levels of conservation of the CTAG-containing 
cleavage sequences as exemplified by the distinct XbaI cleavage
patterns in different Salmonella lineages suggest that profiling of 
such sequences may be used to delineate Salmonella into discrete 
natural lineages. To validate this, we conducted hierarchical 
clustering analysis on the XbaI cleavage profiling data among the
Salmonella strains (Supplementary Table S2). Based on this 
analysis, we constructed a phylogenetic tree (Figure 5) and 
compared it to the core genome-based tree (Figure 6); the two 
trees revealed essentially the same phylogenetic relationships 
among the Salmonella strains. 
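
The clustering step can be illustrated with a small sketch. The study used SPSS for hierarchical clustering and MEGA with the neighbor-joining method for tree construction; the code below is neither of those. It substitutes a plain average-linkage agglomerative clustering on Hamming distances between presence/absence profiles, purely to show how such profiles group strains. The strain names and the 0/1 values are invented for illustration and are not data from Supplementary Table S2.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) profiles over five XbaI sites.
# Names and values are invented; they are not data from this study.
profiles = {
    "lineageA_1": (1, 1, 1, 0, 0),
    "lineageA_2": (1, 1, 1, 0, 1),
    "lineageB_1": (1, 0, 0, 1, 1),
    "lineageB_2": (1, 0, 0, 1, 0),
}


def hamming(p, q):
    """Number of sites whose presence/absence status differs."""
    return sum(a != b for a, b in zip(p, q))


def cluster_distance(c1, c2):
    """Average pairwise Hamming distance between two clusters of strain names."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(hamming(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def average_linkage():
    """Repeatedly merge the two closest clusters; return the tree as nested tuples."""
    clusters = {(name,): name for name in profiles}  # member names -> subtree
    while len(clusters) > 1:
        c1, c2 = min(combinations(sorted(clusters), 2),
                     key=lambda pair: cluster_distance(*pair))
        subtree = (clusters.pop(c1), clusters.pop(c2))
        clusters[tuple(sorted(c1 + c2))] = subtree
    return next(iter(clusters.values()))


tree = average_linkage()
print(tree)
```

In this toy data set, strains of the same hypothetical lineage differ at one site while strains of different lineages differ at three or four, so the two lineages merge internally before they are joined to each other.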

Discussion 

In this study, we sampled a tiny portion of highly conserved 
sequences of the Salmonella genome, i.e., the CTAG-containing 

Figure 5. Phylogenetic tree constructed with the XbaI cleavage data based on numbers of conserved sites shared by subsets of the
bacteria.

endonuclease cleavage sequences, as genomic signatures to probe
the genetic uniqueness of individual Salmonella lineages and 
further test our hypothesis that bacteria dwell in nature as discrete 
genetic clusters. Findings from this may help evaluate and validate 
the genetic boundary concept, which is the core of our hypothesis. 
The highly similar genetic backgrounds in sharp contrast to the 
radical pathogenic differences among Salmonella make this genus 
of bacteria an ideal model for testing the hypothesis and for the 
studies of pathogenic evolution that turns benign organisms into 
infectious agents. 

The topic of bacterial diversification, evolution and speciation
has been a focus of extensive discussion, especially by investigators
viewing it from different angles and using different methods [24-32].
We initiated this work with a comparison between S.
typhimurium and S. typhi, the former causing self-limiting
gastroenteritis but the latter eliciting deadly typhoid fever in
humans, to look for distinct genomic features that can be used to
unambiguously divide them into discrete bacterial clusters, which,
if demonstrated to exist, we call "natural species", as they should
be clusters of bacteria ("species") formed by natural selection. We
recently recognized and characterized clear-cut genomic divergence
between them [33], which we defined as the genetic
boundary. Such genetic boundaries have been documented in a
broad range of bacteria, such as Yersinia and Staphylococcus [14].
In this study, we demonstrate that the selected subset of highly
conserved sequences could reveal the genetic boundaries as clearly
and reliably as whole genome analyses.

Compared to the whole genome strategies, CTAG-containing 
sequence profiling for Salmonella has several advantages. First, 
CTAG-containing cleavage sequence profiling by PFGE requires 
much less time and resources than genome sequencing strategies 
but still provides adequate information to delineate Salmonella 
into discrete genetic clusters, which is especially important when 
very large numbers of bacterial strains are involved; and second, 
the collection and analysis of CTAG-containing sequence data 
profiled by PFGE can be conducted in virtually any molecular 
biology laboratory equipped with the PFGE apparatus. Additionally,
like whole genome sequences, the CTAG-containing cleavage
sequence profiles are also objective and can be compared between 
laboratories and between platforms used. One case to be pointed 

Figure 6. Phylogenetic tree constructed with concatenated core genome sequences, with the numbers beside the nodes indicating
bootstrap values.

out here is that monophyletic Salmonella serotypes like S.
gallinarum may have diverse PFGE patterns (Fig. 1) of cleavage
by XbaI or other endonucleases that have CTAG-containing
cleavage sites, which may reduce the value of CTAG-containing
endonuclease cleavage sequence profiling. However, even in such
cases, well over 50% of the cleavage bands on PFGE are similar
among the wild type strains, so no ambiguity arises.
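
One common way to put a number on how similar two PFGE patterns are is a band-sharing (Dice) coefficient. The text does not say how the "well over 50%" figure was obtained, so the sketch below is only one plausible way to quantify it, under stated assumptions: bands are matched one-to-one if their sizes differ by no more than a tolerance, and the band sizes, the 5 kb tolerance and the function name are all hypothetical.

```python
def shared_band_fraction(bands_a, bands_b, tolerance_kb=5.0):
    """Dice coefficient: 2 * matched bands / (bands in A + bands in B).

    Bands are matched greedily and one-to-one when their sizes (kb)
    differ by at most `tolerance_kb`.
    """
    unmatched = sorted(bands_b)
    matched = 0
    for size in sorted(bands_a):
        for candidate in unmatched:
            if abs(candidate - size) <= tolerance_kb:
                unmatched.remove(candidate)  # each band can be matched once
                matched += 1
                break
    return 2 * matched / (len(bands_a) + len(bands_b))


# Hypothetical band sizes in kb for two strains of one serotype.
strain_1 = [600, 450, 300, 150, 80]
strain_2 = [602, 450, 310, 150, 40]

similarity = shared_band_fraction(strain_1, strain_2)
print(similarity)
```

Here three of five bands match in each lane, giving a coefficient of 0.6; the result depends on the tolerance chosen, which in practice would reflect the resolution of the gel.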

We chose profiling of the CTAG-containing endonuclease cleavage
sequences to probe the Salmonella genomes for their genetic
distinction also because it is a useful and efficient method for
a broad range of studies. For example, in addition to delineating
the bacteria into discrete genetic clusters (i.e., natural species),
which is the primary objective of this study, the profiling has a
particular advantage in tracking the evolutionary scenarios of the
Salmonella lineages, because the CTAG-containing sequences,
though highly conserved in Salmonella, have been in the process
of being eliminated from the genome by the VSP repair
mechanism [15]. Assuming that the CTAG-containing sequences
that have survived natural selection are functionally important, we
expected to see gradual degeneracy of the CTAG-containing
sequences among Salmonella as a whole. Specifically,
the levels of conservation of the CTAG-containing sequences can

Table 2. Bacterial strains used in this study^a.

Strain                          Accession number^b   Reference
S. typhimurium LT2              AE006468             [23]
S. typhimurium 14028S           CP001363
S. typhimurium SL1344           FQ312003
S. typhimurium D23580           FN424405
S. typhimurium ST4/74           CP002487
S. typhimurium UK/1             CP002614
S. typhi Ty2                    AE014613             [17]
S. typhi CT18                   NC_003198
S. typhi P-stx-12               NC_016832
S. paratyphi A ATCC9150         CP000026             [18]
S. paratyphi A AKU_12601        FM200053
S. paratyphi C RKS4594          CP000857             [12,36]
S. agona SL483                  CP001138
S. dublin CT_02021853           CP001144
S. dublin SD3246                CM001151
S. enteritidis P125109          AM933172
S. pullorum RKS5078             CP003047             [21,37]
S. gallinarum 287/91            AM933173             [22]
S. gallinarum SGSC2423          N/A
S. gallinarum SGSC2292          N/A
S. gallinarum SGSC2293          N/A
S. gallinarum R1481             N/A
S. gallinarum R1482             N/A
S. gallinarum R1483             N/A
S. gallinarum SARB21            N/A
S. choleraesuis A50             CM001062
S. choleraesuis SC-B67          AE017220
S. heidelberg B182              NC_017623
S. heidelberg SL476             CP001120
S. newport SL254                CP001113
S. schwarzengrund CVM 19633     CP001127
S. arizonae RKS2980             CP000880
S. arizonae RKS2893             CP006693
S. bongori NCTC 12419           FR877557
S. bongori RKS3044              CP006692

^a See more detailed information on these bacterial strains at www.ucalgary.ca/~kesander.
^b N/A means that the bacterial strain is not sequenced and the genome sequence was not needed for this study.
doi:10.1371/journal.pone.0103388.t002



be stratified by comparing their presence and degeneracy status
(substitution of any of the CTAG nucleotides by transition or
transversion) among the Salmonella lineages. For example, five
XbaI cleavage sites are conserved not only across all Salmonella
lineages compared in this study but also in E. coli (Supplementary
Table S1). Other XbaI cleavage sites are either conserved among
the Salmonella lineages but not in E. coli, or among Salmonella
subgroup I lineages but not in other subgroups, or among strains
of the same lineage, or specific to only particular strains of even the
same lineage (in such cases, they are mostly in prophages or
genomic islands). The differential profiles of the CTAG-containing
cleavage sequences make each of the Salmonella lineages unique
for identification, and the different patterns of sequence degeneracy
among the Salmonella lineages (Supplementary Table S2) may
provide important clues for their strategies in adapting to different
environments (e.g., different host species).
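
The degeneracy status mentioned above, substitution by transition or transversion, can be scored mechanically. The following is a minimal sketch with hypothetical function names; the only sequence pair used is the XbaI 9 example from the Results (TCTAGA in LT2, TCCAGA in the other S. typhimurium strains).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref: str, alt: str) -> str:
    """Classify a single-base change as a transition or a transversion."""
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # Transition: purine <-> purine or pyrimidine <-> pyrimidine.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def degeneracy(ref_site: str, alt_site: str) -> list[tuple[int, str, str, str]]:
    """List (0-based position, ref base, alt base, type) for every mismatch."""
    if len(ref_site) != len(alt_site):
        raise ValueError("sites must have the same length")
    return [(i, r, a, substitution_type(r, a))
            for i, (r, a) in enumerate(zip(ref_site, alt_site)) if r != a]


# XbaI 9: intact site in LT2 versus the degenerate site in the other strains.
changes = degeneracy("TCTAGA", "TCCAGA")
print(changes)
```

For XbaI 9 the single change is T to C at the third base of the site, a transition. Tabulating such calls per site and per lineage would give the kind of degeneracy pattern discussed here.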

Based on our results, we propose the following evolutionary
scenario, in which a small subset of highly conserved sequences
remains a reliable and informative genetic signature of
individual lineages. During the long process of CTAG elimination
[15], each Salmonella lineage (dwelling in its own gene pool [32])
accumulates nucleotide substitutions independently, leading to
gradual degeneracy of the CTAG sequences in a particular way
specific to each of the Salmonella lineages. Detailed analysis of the
substituting and substituted nucleotides during the process of
CTAG sequence degeneracy should provide novel insights into the
strategy and mechanisms during the adaptation process of
individual Salmonella pathogens, especially regarding their
interaction with the host that they infect. We conclude that
CTAG-containing sequence profiling can be used to unambiguously and
efficiently delineate Salmonella into distinct genetic lineages,
which are equivalent to the natural species of bacteria.

Materials and Methods 

Bacterial strains 

Bacterial strains used in this study, along with the accession
numbers of the sequenced genomes, are listed in Table 2; more
detailed information on these bacteria can be found at the
Salmonella Genetic Stock Center (http://www.ucalgary.ca/~kesander/).
Bacteria were grown overnight at 37°C with shaking
in Luria-Bertani (LB) broth or on LB plates. Stock cultures were
stored at -70°C in LB broth with 25% glycerol.

Reagents and PFGE analyses of genomic DNA 

I-CeuI, XbaI and AvrII were purchased from New England
Biolabs, and proteinase K was from Roche. Most other reagents
were from Sigma. Bacterial genomic DNA isolation, endonuclease
cleavage with I-CeuI, XbaI and AvrII, and separation of the
cleavage fragments were described previously [8,17,34]. Briefly,
PFGE was used to separate DNA fragments cleaved by the
endonucleases, and I-CeuI partial cleavage was used to lay out the
overall genome structure of bacteria. PFGE was done in a CHEF
DR II electrophoresis system (BioRad) at 5.6 V/cm with
0.5× TBE buffer as the running buffer.

Genomic and statistics analysis tools 

We determined the phylogenetic relationships of the bacteria
based on their differences in the numbers of conserved
CTAG-containing endonuclease cleavage sites common to subsets of
Salmonella strains, or on the sequence identity of genes common to them,
using the neighbor-joining (NJ) method; tree construction
was done with MEGA4.0.2 [35] and CLUSTALW. The statistical
analyses were performed using SPSS v20.